Data Stream Warehousing In Tidalrace

Authors

  • Theodore Johnson
  • Vladislav Shkapenyuk

Abstract

Big data is a ubiquitous feature of large modern enterprises. Many organizations generate huge amounts of on-line streaming data, for example from network monitoring, Twitter feeds, financial data, and industrial application monitoring. Making effective use of these data streams can be challenging. While Data Stream Management Systems can provide support for real-time alerting and data reduction, many applications require complex analytics on a data history to make the best use of the streams. We have been developing technologies for data stream warehousing, starting with the DataDepot [13] system. A data stream warehouse continually ingests data streams, computes complex derived data products, and stores long (perhaps years-long) histories. To take advantage of new technologies, we have developed a next-generation data stream warehousing system. In this paper we describe the Tidalrace system, our motivations for developing it, and the architectural features of Tidalrace that support data stream warehousing.

Similar resources

An Efficient Stream-based Join to Process End User Transactions in Real-Time Data Warehousing

In the field of real-time data warehousing, semi-stream processing has emerged as an area of research over the last decade. One important operation in semi-stream processing is to join stream data with slowly changing disk-based master data. A join operator is usually required to implement this operation. This join operator typically works under limited main memory and this memory is gener...

Lahar Demonstration: Warehousing Markovian Streams

Lahar is a warehousing system for Markovian streams—a common class of uncertain data streams produced via inference on probabilistic models. Example Markovian streams include text inferred from speech, location streams inferred from GPS or RFID readings, and human activity streams inferred from sensor data. Lahar supports OLAP-style queries on Markovian stream archives by leveraging novel appro...

Optimizing Queue-Based Semi-Stream Joins with Indexed Master Data

In Data Stream Management Systems (DSMS), semi-stream processing has become a popular area of research due to the high demand from applications for up-to-date information (e.g. in real-time data warehousing). A common operation in stream processing is joining an incoming stream with disk-based master data, also known as a semi-stream join. This join typically works under the constraint of limited ma...

Histograms for OLAP and Data-Stream Queries

Histograms are an important tool for data reduction both in the field of data-stream querying and in OLAP, since they allow us to represent large amounts of data in a very compact structure, on which both efficient mining techniques and OLAP queries can be executed. Significant time- and memory-cost advantages may derive from data reduction, but the trade-off with the accuracy has to be managed in...

Database Research at UT Arlington (ITLab@CSE.UTA)

The Information Technology Laboratory (or ITLab) at the Computer Science and Engineering Department at The University of Texas at Arlington was established by Sharma Chakravarthy in Spring 2000. The mission of the ITLab is to conduct research and development on all aspects of information technology. Some of the topics currently being investigated are: Data Warehousing/Information Integration, D...

Publication date: 2015